Robust ASR model adaptation by feature-based statistical data mapping
Authors
Abstract
Automatic speech recognition (ASR) model adaptation is important to many real-life ASR applications because speech is highly variable. Differences in speaker, bandwidth, context, channel, and other factors between the speech databases used to train the initial ASR models and the application data can be major obstacles to the effectiveness of those models. ASR models therefore need to be adapted to their application environments. Maximum Likelihood Linear Regression (MLLR) is a popular model-based method used mainly for speaker adaptation. This paper proposes a feature-based Statistical Data Mapping (SDM) approach, which is more flexible than MLLR across applications, such as those with different bandwidths and contexts. Experimental results on the TIMIT database show that ASR models adapted by the SDM approach have improved accuracy.
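The contrast between model-based and feature-based adaptation drawn in the abstract can be illustrated with a small sketch. The abstract does not describe how SDM works, so the feature-space function below is only a generic moment-matching mapping chosen for illustration, not the paper's method. The model-space function applies the standard MLLR mean update, mu' = A mu + b, with a transform that is assumed to be given rather than estimated. All function names and the toy data are hypothetical.

```python
from statistics import mean, pstdev


def mllr_adapt_means(means, A, b):
    """Model-space adaptation: apply mu' = A @ mu + b to each Gaussian mean.

    A and b are assumed to be already estimated; MLLR estimates them by
    maximum likelihood on adaptation data, which is not shown here.
    """
    adapted = []
    for mu in means:
        adapted.append([
            sum(a_ij * m_j for a_ij, m_j in zip(row, mu)) + b_i
            for row, b_i in zip(A, b)
        ])
    return adapted


def moment_matching_map(app_feats, train_feats):
    """Feature-space adaptation (illustrative assumption, not SDM itself).

    Shifts and scales each dimension of the application features so that
    its mean and standard deviation match those of the training features.
    The acoustic model is left untouched.
    """
    dims = len(train_feats[0])
    mapped_cols = []
    for d in range(dims):
        src = [x[d] for x in app_feats]
        tgt = [x[d] for x in train_feats]
        s_mu, s_sd = mean(src), pstdev(src)
        t_mu, t_sd = mean(tgt), pstdev(tgt)
        # Guard against a constant dimension in the application data.
        scale = t_sd / s_sd if s_sd > 0 else 1.0
        mapped_cols.append([(v - s_mu) * scale + t_mu for v in src])
    # Transpose columns back into one feature vector per frame.
    return [list(row) for row in zip(*mapped_cols)]


# Toy usage with made-up numbers.
adapted_means = mllr_adapt_means([[1.0, 2.0]], [[2.0, 0.0], [0.0, 1.0]], [1.0, -1.0])
mapped_feats = moment_matching_map([[0.0], [2.0]], [[10.0], [14.0]])
```

In the first function the model parameters move toward the application data; in the second the application features move toward the training data and the model is left unchanged. Because a feature-space mapping needs no access to model parameters, it can be applied to mismatches that are not speaker-specific, which is consistent with the flexibility the abstract claims for a feature-based approach.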
Similar resources
Noise adaptation for robust AURORA 2 noisy digit recognition using statistical data mapping
The mismatch between system training and operating conditions often has negative influences on automatic speech recognition (ASR) systems. Noise in the operating environments is commonly encountered. ASR model adaptation is an important way to enhance the system performance in noisy environments. This paper proposes a feature-based statistical data mapping (SDM) approach for robust noisy digit ...
Multilingual speech recognition A posterior based approach
Modern automatic speech recognition (ASR) systems are based on parametric statistical models such as hidden Markov models (HMMs), exploiting 1) acoustic-phonetic models, which need to be trained on large amounts of acoustic data, 2) a language model, which needs to be trained on large amounts of text data and, finally, 3) a lexicon with phonetic transcriptions, which requires linguistic expertise. ...
Statistical Adaptation of Acoustic M for Robust Speech Re
Noise degrades the performance of Automatic Speech Recognition (ASR) systems working in real conditions. The mismatch between the training and recognition conditions is considered the main factor involved in this degradation, and most methods for robust ASR are focussed on its minimization. In this work, we compare robust methods for ASR based on (a) the compensation of the noise effects and (b)...
Deep neural network based spectral feature mapping for robust speech recognition
Automatic speech recognition (ASR) systems suffer from performance degradation under noisy and reverberant conditions. In this work, we explore a deep neural network (DNN) based approach for spectral feature mapping from corrupted speech to clean speech. The DNN based mapping substantially reduces interference and produces estimated clean spectral features for ASR training and decoding. We expe...
Robust ASR in Reverberant Environments Using Temporal Cepstrum Smoothing for Speech Enhancement and an Amplitude Modulation Filterbank for Feature Extraction
This paper presents techniques aiming at improving automatic speech recognition (ASR) in single channel scenarios in the context of the REVERB (REverberant Voice Enhancement and Recognition Benchmark) challenge. System improvements range from speech enhancement over robust feature extraction to model adaptation and word-based integration of multiple classifiers. The selective temporal cepstrum ...